- Path: aadt.sdt.com!usenet
- From: Larry Baker <leb@sdt.com>
- Newsgroups: comp.lang.c++,comp.os.ms-windows.programmer.win32
- Subject: Re: VC++ 4.0 memory allocation slower than in 2.x!!!
- Date: Mon, 01 Apr 1996 13:25:59 -0800
- Organization: SABRE Decision Technologies
- Message-ID: <316049E7.739D@sdt.com>
- References: <alanDozpsy.Kn6@netcom.com> <315C4DA0.30C@Bentley.com>
- NNTP-Posting-Host: parmail.sdt.com
- Mime-Version: 1.0
- Content-Type: text/plain; charset=iso-8859-1
- Content-Transfer-Encoding: 8bit
- X-Mailer: Mozilla 2.0 (Win16; I)
- CC: larryb
-
- [This is a little long; wade through the quotes to get the
- context. My own text starts after them.]
-
- Philip McGraw wrote:
- > The VC 4.0 C runtime libraries use HeapAlloc and HeapFree instead of
- > the suballocation from VirtualAlloc'ed memory that was used in VC 2.x.
-
- To Robert Allen White's original request:
- > > Can anyone offer any more rational explanation for this than that
- > > Microsoft has horrible quality control? I'm having a hard time
- > > believing they could do screw up this badly.
-
- To which Michal McKinley also replied:
- > If you`re that dependant on the speed of malloc (or new since it uses
- > malloc anyway) then perhaps you shouldn`t be using any version of it as
- > none of them were built with speed in mind at all...
-
- And on which Jim Marshall commented:
- > [re: 2.0 heap management scheme]
- > It fragments memory too much, and hence winds up
- > allocating to much (if your app does enough new/mallocs you could
- > cause even NT to fail with insf. memory).
-
- To the last two comments: this is a perpetual problem with malloc. The
- version that ships with IBM AIX has facilities for handling fragmentation
- and for pre-allocating blocks of specific sizes, which can *dramatically*
- speed up code.
-
- C++ programs have a tendency to allocate lots of tiny chunks of memory.
- It is exceptionally hard to write *any* good memory management
- algorithm for that pattern without incurring a lot of
- garbage-collection-like overhead.
-
- The only really efficient way to handle it is to use some kind of
- profiling utility (or write your own) to try to come up with a
- distribution of common allocation sizes, and then pre-allocate
- blocks that can handle them. Then use a smallest-that-fits strategy
- with a quick way to look up the "pool" of pre-allocated memory blocks.
- In short, roll-your-own optimized {m,c,re}alloc.
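
As a rough illustration of the pre-allocated pool idea above, here is a minimal C++ sketch. Everything in it is an assumption made for the example: the size classes (16/32/64/128 bytes), the pool depth, and the names PoolAllocator, allocate, deallocate and from_pool are hypothetical, and real size classes would come out of your own profiling. It is not thread-safe, and it is not what any vendor's runtime does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical size classes, sorted ascending. In practice these would be
// derived from a profile of the program's real allocation sizes.
constexpr std::array<std::size_t, 4> kClassSizes = {16, 32, 64, 128};
constexpr std::size_t kBlocksPerClass = 256;  // assumed pool depth

class PoolAllocator {
public:
    PoolAllocator() {
        for (std::size_t c = 0; c < kClassSizes.size(); ++c) {
            // One slab per size class, carved into fixed-size blocks.
            slabs_[c] = static_cast<char*>(
                std::malloc(kClassSizes[c] * kBlocksPerClass));
            if (slabs_[c] == nullptr) continue;  // class simply stays empty
            free_[c].reserve(kBlocksPerClass);
            for (std::size_t i = 0; i < kBlocksPerClass; ++i)
                free_[c].push_back(slabs_[c] + i * kClassSizes[c]);
        }
    }
    ~PoolAllocator() { for (char* slab : slabs_) std::free(slab); }
    PoolAllocator(const PoolAllocator&) = delete;
    PoolAllocator& operator=(const PoolAllocator&) = delete;

    void* allocate(std::size_t n) {
        const int c = class_for(n);
        // Too large for any class, or the pool is exhausted: use malloc.
        if (c < 0 || free_[c].empty()) return std::malloc(n);
        void* p = free_[c].back();
        free_[c].pop_back();
        return p;
    }

    void deallocate(void* p) {
        if (p == nullptr) return;
        const int c = owner_of(p);
        if (c < 0) { std::free(p); return; }          // came from malloc
        assert(free_[c].size() < kBlocksPerClass);    // catches some double frees
        free_[c].push_back(static_cast<char*>(p));
    }

    // True if p lies inside one of the pre-allocated slabs.
    bool from_pool(const void* p) const { return owner_of(p) >= 0; }

private:
    // Smallest-that-fits: first class whose block size can hold n bytes.
    static int class_for(std::size_t n) {
        for (std::size_t c = 0; c < kClassSizes.size(); ++c)
            if (n <= kClassSizes[c]) return static_cast<int>(c);
        return -1;
    }

    // Finds the slab containing p by address range, or -1 if none does.
    int owner_of(const void* p) const {
        const auto addr = reinterpret_cast<std::uintptr_t>(p);
        for (std::size_t c = 0; c < kClassSizes.size(); ++c) {
            if (slabs_[c] == nullptr) continue;
            const auto base = reinterpret_cast<std::uintptr_t>(slabs_[c]);
            if (addr >= base && addr < base + kClassSizes[c] * kBlocksPerClass)
                return static_cast<int>(c);
        }
        return -1;
    }

    std::array<char*, kClassSizes.size()> slabs_{};
    std::array<std::vector<char*>, kClassSizes.size()> free_;
};
```

Both the class lookup and the owner lookup are short linear scans over four entries, which is the "quick way to look up the pool" part. With many more classes you would probably want a table indexed by the rounded-up size instead.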
-
- You might be surprised at how much this can speed up an otherwise
- innocuous section of C or C++ code. Everyone seems to take malloc
- and friends for granted.
-
- I checked the Win32 help on VirtualAlloc and Heap{Alloc, Create},
- and note the following:
-
- VirtualAlloc operates directly on the virtual memory tables; hence,
- it allocates memory in whole pages, where you specify the address
- range to allocate, and has options for locking them in memory, etc.
- But explicit management of the memory provided (suballocation) must
- be done manually. To those who can check: is the 2.0 *alloc
- thread-safe?
-
- GlobalAlloc operates on 'the heap.' It appears to be a legacy
- interface carried over from Win16/Win32s. There is no mention
- of thread safety.
-
- HeapAlloc, by default, performs mutual exclusion on its heap
- object argument. This incurs an unnecessary performance penalty
- if you're a single-threaded app, and can be avoided by allocating
- per-thread heaps if you're a multi-threaded app. Otherwise, it's
- necessary, either at the OS or at the runtime level. See the excerpt
- from my Win32 Online Help, below.
-
- - VirtualAlloc appears to be the low-level interface to the virtual
- memory subsystem. GlobalAlloc appears to be a general heap manager,
- and does not differentiate between the old local or global heaps.
- HeapAlloc appears to be a complete generalization of heap management,
- accounting for thread safety at the OS level.
-
- In the version of the help file which I have (which is probably
- pretty dated), there's not much detail on which one you should choose.
- I speculate that if you're interested enough in speed to write your own
- versions of malloc() and free(), you'd probably want to do it on top
- of VirtualAlloc, since it appears to be the lowest-level function. But
- bear in mind that you'll have to do your own memory map, as the
- routine expects an address range to map into virtual memory. And,
- if you're multi-threaded, you'll have to handle multiple heaps (a la
- HeapCreate) or mutual exclusion (a la HeapAlloc).
-
- I speculate that they changed over to Heap{Alloc, et al} due to
- their explicit thread safety at the OS level. I further speculate
- that Microsoft probably doesn't really care much whether you're
- running a 486SX/33 or a quad-processor 200 MHz Pentium Pro.
-
- This sort of thing can't help but bias benchmarks, as evidenced
- by last month's Byte Magazine on the subject. They discovered a
- similar performance sensitivity to malloc: sometimes (at least, with
- MS 4.x) it returns byte-aligned allocations, rather than
- word-aligned ones. This causes roughly a 50% (20%?) performance
- penalty on the Pentium for misaligned 32-bit or greater reads, as the
- hardware has to do some monkey business for things like floats,
- doubles, ints...
-
- - It would be useful if someone with 4.x could check their online
- reference materials to see if they can illuminate the subject further.
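
For anyone who wants to check the alignment point above on their own compiler, a small sketch along these lines should do. The helper names is_aligned and align_up are made up for the example, and the code only assumes that a pointer converts to std::uintptr_t as a plain address, which holds on the flat-memory platforms discussed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// True if p is a multiple of `alignment`, which must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Rounds p up to the next multiple of `alignment` (a power of two).
// The caller must own enough slack after p for the adjusted pointer.
inline void* align_up(void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<void*>((addr + mask) & ~mask);
}
```

Calling is_aligned(malloc(n), sizeof(double)) over a range of small n would show whether a given runtime hands back misaligned blocks. A roll-your-own allocator can use align_up when carving blocks out of a larger region, at the cost of a few wasted bytes per block.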
-
- From the Win32 Online Help accompanying my Borland C++ v4.52:
-
- (begin quoted text)
-
- [Per the arguments to HeapCreate, HeapAlloc et al:]
- [Search for HeapCreate]
-
- If the HEAP_NO_SERIALIZE flag is not specified (the simple default),
- the heap will serialize access within the calling process. Serialization
- ensures mutual exclusion when two or more threads attempt to simultaneously
- allocate or free blocks from the same heap. There is a small performance
- cost to serialization, but it must be used whenever multiple threads
- allocate and free memory from the same heap. Setting the
- HEAP_NO_SERIALIZE flag eliminates mutual exclusion on the heap. Without
- serialization, two or more threads that use the same heap handle might
- attempt to allocate or free memory simultaneously, likely causing corruption
- in the heap. The HEAP_NO_SERIALIZE flag can, therefore, be safely used
- only in the following situations:
-
- * The process has only one thread.
- * The process has multiple threads, but only one thread calls the
- heap functions for a specific heap.
- * The process has multiple threads, and the application provides
- its own mechanism for mutual exclusion to a specific heap.
-
- (end quoted text)
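
To make the second and third situations in the quoted help text concrete, here is a portable C++ sketch rather than actual Win32 code. UnlockedArena is a hypothetical stand-in for a heap created with HEAP_NO_SERIALIZE: a trivial bump allocator with no locking and no free. thread_arena and LockedArena are likewise names invented for the example, and the 64 KB arena size is arbitrary.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Stand-in for an unserialized heap: no internal locking, so it is only
// safe under the conditions listed in the help text.
class UnlockedArena {
public:
    explicit UnlockedArena(std::size_t bytes) : buf_(bytes), used_(0) {}

    void* allocate(std::size_t n) {
        if (n > buf_.size()) return nullptr;
        n = (n + 7) & ~static_cast<std::size_t>(7);  // round size up to 8 bytes
        if (n > buf_.size() - used_) return nullptr; // arena exhausted
        void* p = buf_.data() + used_;
        used_ += n;
        return p;
    }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> buf_;
    std::size_t used_;
};

// Situation 2: each thread only ever touches its own heap, so no lock.
inline UnlockedArena& thread_arena() {
    thread_local UnlockedArena arena(1 << 16);
    return arena;
}

// Situation 3: one shared heap, with the application supplying the
// mutual exclusion itself.
class LockedArena {
public:
    explicit LockedArena(std::size_t bytes) : arena_(bytes) {}

    void* allocate(std::size_t n) {
        std::lock_guard<std::mutex> guard(mu_);
        return arena_.allocate(n);
    }

    std::size_t used() {
        std::lock_guard<std::mutex> guard(mu_);
        return arena_.used();
    }

private:
    std::mutex mu_;
    UnlockedArena arena_;
};
```

The per-thread version pays nothing for locking but memory cannot be handed from one thread to another and returned to a different arena. The locked version allows that, at the cost the help text describes.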
-
- Cheers,
-
- Larry Baker
- leb@sdt.com
-